AITopics | deep classifier

UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs

Neural Information Processing SystemsDec-24-2025, 16:42:11 GMT

We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). While most works in the literature that use GANs to generate out-of-distribution (OoD) examples only focus on the evaluation of OoD detection, we present a GAN based approach to learn a classifier that produces proper uncertainties for OoD examples as well as for false positives (FPs). Instead of shielding the entire in-distribution data with GAN generated OoD examples which is state-of-the-art, we shield each class separately with out-of-class examples generated by a conditional GAN and complement this with a one-vs-all image classifier. In our experiments, in particular on CIFAR10, CIFAR100 and Tiny ImageNet, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers. Furthermore, we also find that the generated GAN examples do not significantly affect the calibration error of our classifier and result in a significant gain in model accuracy.

deep classifier, uncertainty quantification, unified model, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Add feedback

4120362f930b0b683cd30b71af56fad1-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsAug-14-2025, 10:18:24 GMT

dataset, hard imagenet, imagenet, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Maryland (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)

Add feedback

UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs

Neural Information Processing SystemsJan-17-2025, 03:32:36 GMT

We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). While most works in the literature that use GANs to generate out-of-distribution (OoD) examples only focus on the evaluation of OoD detection, we present a GAN based approach to learn a classifier that produces proper uncertainties for OoD examples as well as for false positives (FPs). Instead of shielding the entire in-distribution data with GAN generated OoD examples which is state-of-the-art, we shield each class separately with out-of-class examples generated by a conditional GAN and complement this with a one-vs-all image classifier. In our experiments, in particular on CIFAR10, CIFAR100 and Tiny ImageNet, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers. Furthermore, we also find that the generated GAN examples do not significantly affect the calibration error of our classifier and result in a significant gain in model accuracy.

conditional gan, deep classifier, uncertainty quantification, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

New insights into training dynamics of deep classifiers

#artificialintelligenceMar-9-2023, 12:51:35 GMT

A new study from researchers at MIT and Brown University characterizes several properties that emerge during the training of deep classifiers, a type of artificial neural network commonly used for classification tasks such as image classification, speech recognition, and natural language processing. The paper, "Dynamics in Deep Classifiers trained with the Square Loss: Normalization, Low Rank, Neural Collapse and Generalization Bounds," published today in the journal Research, is the first of its kind to theoretically explore the dynamics of training deep classifiers with the square loss and how properties such as rank minimization, neural collapse, and dualities between the activation of neurons and the weights of the layers are intertwined. In the study, the authors focused on two types of deep classifiers: fully connected deep networks and convolutional neural networks (CNNs). A previous study examined the structural properties that develop in large neural networks at the final stages of training. That study focused on the last layer of the network and found that deep networks trained to fit a training dataset will eventually reach a state known as "neural collapse." When neural collapse occurs, the network maps multiple examples of a particular class (such as images of cats) to a single template of that class.

deep classifier, generalization, neural collapse, (14 more...)

#artificialintelligence

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)

Genre:

Research Report > Experimental Study (0.36)
Research Report > New Finding (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

An Incremental Gray-box Physical Adversarial Attack on Neural Network Training

Al-qudah, Rabiah, Aloqaily, Moayad, Ouni, Bassem, Guizani, Mohsen, Lestable, Thierry

arXiv.org Artificial IntelligenceFeb-20-2023

Neural networks have demonstrated remarkable success in learning and solving complex tasks in a variety of fields. Nevertheless, the rise of those networks in modern computing has been accompanied by concerns regarding their vulnerability to adversarial attacks. In this work, we propose a novel gradient-free, gray box, incremental attack that targets the training process of neural networks. The proposed attack, which implicitly poisons the intermediate data structures that retain the training instances between training epochs acquires its high-risk property from attacking data structures that are typically unobserved by professionals. Hence, the attack goes unnoticed despite the damage it can cause. Moreover, the attack can be executed without the attackers' knowledge of the neural network structure or training data making it more dangerous. The attack was tested under a sensitive application of secure cognitive cities, namely, biometric authentication. The conducted experiments showed that the proposed attack is effective and stealthy. Finally, the attack effectiveness property was concluded from the fact that it was able to flip the sign of the loss gradient in the conducted experiments to become positive, which indicated noisy and unstable training. Moreover, the attack was able to decrease the inference probability in the poisoned networks compared to their unpoisoned counterparts by 15.37%, 14.68%, and 24.88% for the Densenet, VGG, and Xception, respectively. Finally, the attack retained its stealthiness despite its high effectiveness. This was demonstrated by the fact that the attack did not cause a notable increase in the training time, in addition, the Fscore values only dropped by an average of 1.2%, 1.9%, and 1.5% for the poisoned Densenet, VGG, and Xception, respectively.

artificial intelligence, experiment, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2303.01245

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > West Virginia (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.49)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

BERT-LID: Leveraging BERT to Improve Spoken Language Identification

Nie, Yuting, Zhao, Junhong, Zhang, Wei-Qiang, Bai, Jinfeng

arXiv.org Artificial IntelligenceOct-11-2022

Language identification is the task of automatically determining the identity of a language conveyed by a spoken segment. It has a profound impact on the multilingual interoperability of an intelligent speech system. Despite language identification attaining high accuracy on medium or long utterances(>3s), the performance on short utterances (<=1s) is still far from satisfactory. We propose a BERT-based language identification system (BERT-LID) to improve language identification performance, especially on short-duration speech segments. We extend the original BERT model by taking the phonetic posteriorgrams (PPG) derived from the front-end phone recognizer as input. Then we deployed the optimal deep classifier followed by it for language identification. Our BERT-LID model can improve the baseline accuracy by about 6.5% on long-segment identification and 19.9% on short-segment identification, demonstrating our BERT-LID's effectiveness to language identification.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2203.00328

Country:

Asia > China (0.05)
North America > United States > Colorado (0.04)
Asia > East Asia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)

Add feedback

Deep Classifiers with Label Noise Modeling and Distance Awareness

Fortuin, Vincent, Collier, Mark, Wenzel, Florian, Allingham, James, Liu, Jeremiah, Tran, Dustin, Lakshminarayanan, Balaji, Berent, Jesse, Jenatton, Rodolphe, Kokiopoulou, Effrosyni

arXiv.org Machine LearningOct-6-2021

Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncertainty are often necessary. In this work, we propose the HetSNGP method for jointly modeling the model and data uncertainty. We show that our proposed model affords a favorable combination between these two complementary types of uncertainty and thus outperforms the baseline methods on some challenging out-of-distribution datasets, including CIFAR-100C, Imagenet-C, and Imagenet-A. Moreover, we propose HetSNGP Ensemble, an ensembled version of our method which adds an additional type of uncertainty and also outperforms other ensemble baselines.

arxiv preprint arxiv, dataset, model uncertainty, (13 more...)

arXiv.org Machine Learning

2110.02609

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets

Al, Walid Abdullah, Yun, Il Dong

arXiv.org Machine LearningSep-2-2019

With the advents of deep learning, improved image classification with complex discriminative models has been made possible. However, such deep models with increased complexity require a huge set of labeled samples to generalize the training. Such classification models can easily overfit when applied for medical images because of limited training data, which is a common problem in the field of medical image analysis. This paper proposes and investigates a reinforced classifier for improving the generalization under a few available training data. Partially following the idea of reinforcement learning, the proposed classifier uses a generalization-feedback from a subset of the training data to update its parameter instead of only using the conventional cross-entropy loss about the training data. We evaluate the improvement of the proposed classifier by applying it on three different classification problems against the standard deep classifiers equipped with existing overfitting-prevention techniques. Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification tasks.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

1909.0563

Genre: Research Report > New Finding (0.46)

Industry: